Abstract. We propose a method to determine the critical noise level for 
decoding Gallager type low density parity check error correcting codes. 
The method is based on the magnetization enumerator (M), rather than 
on the weight enumerator (W) presented recently in the information 
theory literature. The interpretation of our method is appealingly simple, 
and the relation between the different decoding schemes such as typical 
pairs decoding, MAP, and finite temperature decoding (MPM) becomes 
clear. Our results are more optimistic than those derived via the methods 
of information theory and are in excellent agreement with recent results 
from another statistical physics approach. 



1 Introduction 

Triggered by active investigations of error correcting codes in both the
information theory (IT) [?] and statistical physics (SP) communities, there is a 
growing interest in the relationship between IT and SP. As the two communities 
investigate similar problems, one may expect that standard techniques known 
in one framework would bring about new developments in the other, and vice 
versa. Here we present a direct SP method to determine the critical noise level 
for Gallager type low density parity check codes which allows us to focus on the 
differences between the various decoding criteria and their approach for defin- 
ing the critical noise level for which decoding, using Low Density Parity Check 
(LDPC) codes, is theoretically feasible. 



2 Gallager code 

In a general scenario, the $N$-dimensional Boolean message $s^0 \in \{0,1\}^N$ is
encoded to the $M(>N)$-dimensional Boolean vector $t^0$, and transmitted via a 
noisy channel, which is taken here to be a Binary Symmetric Channel (BSC) 
characterized by an independent flip probability p per bit; other transmission 
channels may also be examined within a similar framework. At the other end of 
the channel, the corrupted codeword is decoded utilizing the structured code- 
word redundancy. 



The error correcting code that we focus on here is Gallager's linear code [?].
Gallager's code is a low density parity check code defined by a binary
$(M-N) \times M$ matrix $A = [C_1 | C_2]$, concatenating two very sparse matrices
known to both sender and receiver, with the $(M-N) \times (M-N)$ matrix $C_2$ being
invertible. The matrix $A$ has $K$ non-zero elements per row and $C$ per column;
counting the non-zero elements by rows and by columns gives $(M-N)K = MC$, so
the code rate is given by $R = 1 - C/K = N/M$. Encoding refers to multiplying the
original message $s^0$ with the $M \times N$ matrix $G^T$ (where $G^T$ stacks the
$N \times N$ identity on top of $C_2^{-1} C_1$), yielding the transmitted vector
$t^0$. Note that all operations are carried out in (mod 2) arithmetic. Upon
sending $t^0$ through the BSC with noise level $p$, the vector $r = t^0 + n^0$ is
received, where $n^0$ is the true noise.
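
The encoding and transmission steps above can be sketched numerically. The following is a minimal illustration only: the matrices `C1` and `C2` are small hypothetical examples (dense, and not a regular $(K, C)$ construction), and the helper `gf2_inv` is an assumption introduced here for mod 2 matrix inversion.

```python
import numpy as np

def gf2_inv(mat):
    """Invert a square binary matrix over GF(2) by Gauss-Jordan elimination."""
    size = mat.shape[0]
    aug = np.concatenate([mat.astype(np.int64) % 2,
                          np.eye(size, dtype=np.int64)], axis=1)
    for col in range(size):
        candidates = np.nonzero(aug[col:, col])[0]
        if candidates.size == 0:
            raise ValueError("matrix is singular over GF(2)")
        pivot = col + candidates[0]
        aug[[col, pivot]] = aug[[pivot, col]]   # move the pivot row into place
        for row in range(size):
            if row != col and aug[row, col]:
                aug[row] ^= aug[col]            # row addition mod 2
    return aug[:, size:]

# Toy matrices (hypothetical; a real Gallager code uses very sparse matrices).
C1 = np.array([[1, 1, 0], [0, 1, 1], [1, 0, 1]])
C2 = np.array([[1, 0, 0], [1, 1, 0], [0, 1, 1]])  # unit lower triangular, so invertible
A = np.hstack([C1, C2])                            # (M-N) x M parity check matrix
N, M = C1.shape[1], A.shape[1]

# Generator in systematic form: identity stacked on C2^{-1} C1 (mod 2).
GT = np.vstack([np.eye(N, dtype=np.int64), gf2_inv(C2) @ C1 % 2])

rng = np.random.default_rng(0)
p = 0.1                                        # BSC flip probability
s0 = rng.integers(0, 2, size=N)                # original message
t0 = GT @ s0 % 2                               # transmitted codeword
n0 = (rng.random(M) < p).astype(np.int64)      # true noise vector
r = (t0 + n0) % 2                              # received vector
z = A @ r % 2                                  # syndrome, depends on the noise only
```

Because $A G^T = 0$ (mod 2), the syndrome computed from the received vector equals the syndrome of the noise alone, which is what makes decoding a problem about $n^0$ rather than about the message.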

Decoding is carried out by multiplying $r$ by $A$ to produce the syndrome
vector $z = Ar$ ($= An^0$, since $AG^T = 0$). In order to reconstruct the original
message $s^0$, one has to obtain an estimate $n$ for the true noise $n^0$. First we select
all $n$ that satisfy the parity checks $An = An^0$:

$$ \mathcal{I}_{pc}(A,n^0) = \{ n \mid An = z \}, \quad \text{and} \quad \mathcal{I}^r_{pc}(A,n^0) = \{ n \in \mathcal{I}_{pc}(A,n^0) \mid n \neq n^0 \}, \tag{1} $$

the (restricted) parity check set. Any general decoding scheme then consists
of selecting a vector $n^*$ from $\mathcal{I}_{pc}(A,n^0)$ on the basis of some noise statistics
criterion. Upon successful decoding $n^0$ will be selected, while a decoding error
is declared when a vector $n^* \in \mathcal{I}^r_{pc}(A,n^0)$ is selected. A measure for the error
probability is usually defined in the information theory literature as

$$ P_e(p) = \left\langle \Delta\left( \exists\, n \in \mathcal{I}^r_{pc}(A,n^0) : w(n) < w(n^0) \,\middle|\, n^0 \right) \right\rangle_{A,n^0}, \tag{2} $$

where $\Delta(\cdot)$ is an indicator function returning 1 if there exists a vector
$n \in \mathcal{I}^r_{pc}(A,n^0)$ with lower weight than that of the given noise vector $n^0$.
The weight of a vector is the average of its components,
$w(n) = \frac{1}{M}\sum_{j=1}^{M} n_j$. To obtain the error probability, one averages
the indicator function over all $n^0$ vectors drawn from some distribution and
over the code ensemble $A$, as denoted by $\langle \cdot \rangle_{A,n^0}$.
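
For very small codes the indicator function can be evaluated by exhaustive enumeration. The sketch below is illustrative only: it uses a hypothetical dense toy matrix `A`, averages over noise vectors for one fixed code (not over the code ensemble), and the function names are introduced here for the example.

```python
import itertools
import numpy as np

# Hypothetical toy parity check matrix with M = 6 and N = 3.
A = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 1, 1, 1, 0],
              [1, 0, 1, 0, 1, 1]])

def parity_check_set(A, z):
    """All binary vectors n with A n = z (mod 2), by exhaustive enumeration."""
    M = A.shape[1]
    return [np.array(bits) for bits in itertools.product((0, 1), repeat=M)
            if np.array_equal(A @ np.array(bits) % 2, z)]

def weight(n):
    """Weight as defined in the text: the average of the components."""
    return float(np.mean(n))

def error_indicator(A, n0):
    """1 if some other vector with the same syndrome has lower weight than n0."""
    z = A @ n0 % 2
    restricted = [n for n in parity_check_set(A, z) if not np.array_equal(n, n0)]
    return int(any(weight(n) < weight(n0) for n in restricted))

def estimate_error_probability(A, p, samples=200, seed=0):
    """Monte Carlo average of the indicator over BSC noise, for a fixed code A."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(samples):
        n0 = (rng.random(A.shape[1]) < p).astype(np.int64)
        hits += error_indicator(A, n0)
    return hits / samples

pe_estimate = estimate_error_probability(A, p=0.1)
```

The enumeration cost grows as $2^M$, which is why the analytical averaging described in the text is needed for realistic system sizes.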

Carrying out averages over the indicator function is difficult, and the error
probability (2) is therefore upper-bounded by averaging over the number of
vectors $n$ obeying the weight condition $w(n) < w(n^0)$. Alternatively, one can find
the average number of vectors with a given weight value $w$ from which one can
construct a complete weight distribution of noise vectors $n$ in $\mathcal{I}^r_{pc}(A,n^0)$. From
this distribution one can, in principle, calculate a bound for $P_e$ and derive critical
noise values above which successful decoding cannot be carried out.

A natural and direct measure for the average number of states is the entropy 
of a system under the restrictions described above, that can be calculated via 
the methods of statistical physics. 

It was previously shown (see e.g. [?] for technical details) that this problem
can be cast into a statistical mechanics formulation, by replacing the field
$(\{0,1\}, + \bmod 2)$ by $(\{1,-1\}, \times)$, and by adapting the parity checks
correspondingly. The statistics of a noise vector $n$ is now described by its
magnetization $m(n) = \frac{1}{M}\sum_{j=1}^{M} n_j$ ($m(n) \in [-1,1]$), which is inversely
linked to the vector weight in the $\{0,1\}$ representation. With this in mind, we
introduce the conditioned magnetization enumerator, for a given code and noise,
measuring the noise vector magnetization distribution in $\mathcal{I}^r_{pc}(A,n^0)$:

$$ \mathcal{M}_{A,n^0}(m) = \frac{1}{M} \ln\left[ \mathrm{Tr}_{n \in \mathcal{I}^r_{pc}(A,n^0)}\, \delta(m(n) - m) \right]. \tag{3} $$

To obtain the magnetization enumerator $\mathcal{M}(m)$,

$$ \mathcal{M}(m) = \left\langle \mathcal{M}_{A,n^0}(m) \right\rangle_{A,n^0}, \tag{4} $$

which is the entropy of the noise vectors in $\mathcal{I}^r_{pc}(A,n^0)$ with a given $m$, one
carries out uniform explicit averages over all codes $A$ with given parameters
$K$, $C$, and a weighted average over all possible noise vectors generated by the
BSC, i.e.,

$$ P(n^0) = \prod_{j=1}^{M} \left( (1-p)\,\delta(n^0_j - 1) + p\,\delta(n^0_j + 1) \right). \tag{5} $$
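
As an illustration of the conditioned magnetization enumerator, the following sketch counts, for a tiny hypothetical code and one fixed noise vector, how many vectors of the restricted parity check set have each magnetization. The matrix `A` and the function name are assumptions made for this example; the mapping $0 \to +1$, $1 \to -1$ is used to pass to the $\pm 1$ representation.

```python
import itertools
import math
from collections import Counter
import numpy as np

# Hypothetical toy parity check matrix with M = 6 and N = 3.
A = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 1, 1, 1, 0],
              [1, 0, 1, 0, 1, 1]])

def magnetization_enumerator(A, n0):
    """Map each magnetization m to (1/M) ln(number of vectors), over the
    restricted parity check set of the binary noise vector n0."""
    M = A.shape[1]
    z = A @ n0 % 2
    counts = Counter()
    for bits in itertools.product((0, 1), repeat=M):
        n = np.array(bits)
        if np.array_equal(n, n0) or not np.array_equal(A @ n % 2, z):
            continue                       # skip n0 itself and non-solutions
        spins = 1 - 2 * n                  # {0, 1} -> {+1, -1}
        counts[int(spins.sum())] += 1      # key by M * m to avoid float keys
    return {total / M: math.log(c) / M for total, c in counts.items()}

n0 = np.zeros(A.shape[1], dtype=np.int64)  # noiseless case as a simple example
enumerator = magnetization_enumerator(A, n0)
```

For a single small instance the result fluctuates strongly with the choice of code and noise; the averaged quantity in Eq. (4) is meaningful in the large $M$ limit.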



It is important to note that, in calculating the entropy, the average quantity
of interest is the magnetization enumerator rather than the actual number of
states. For physicists, this is the natural way to carry out the averages for
three main reasons: a) The entropy obtained in this way is believed to be
self-averaging, i.e., its average value (over the disorder) coincides with its
typical value. b) This quantity is extensive and grows linearly with the system
size. c) This averaging distinguishes between annealed variables, which are
averaged or summed for a given set of quenched variables, and the quenched
variables themselves, which are averaged over later on. In this particular case,
summation over all $n$ vectors is carried out for a fixed choice of code $A$ and
noise vector $n^0$; averages over these variables are carried out at the next level.

One should point out that in somewhat similar calculations, we showed that
this method of carrying out the averages provides more accurate results in
comparison to averaging over both sets of variables simultaneously [?].

A positive magnetization enumerator, $\mathcal{M}(m) > 0$, indicates that there is an
exponential number of solutions (in $M$) with magnetization $m$, for typically
chosen $A$ and $n^0$, while $\mathcal{M}(m) \to 0$ indicates that this number vanishes as
$M \to \infty$ (note that negative entropy is unphysical in discrete systems).

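Another important indicator for successful decoding is the overlap $\omega$ between
the selected estimate $n^*$ and the true noise $n^0$:
$\omega(n,n^0) = \frac{1}{M}\sum_{j=1}^{M} n_j n^0_j$ ($\omega(n,n^0) \in [-1,1]$), with $\omega = 1$ for
successful (perfect) decoding. However, this quantity cannot be used for
decoding as $n^0$ is unknown to the receiver. The (code and noise dependent)
overlap enumerator is now defined as:

$$ \mathcal{W}_{A,n^0}(\omega) = \frac{1}{M} \ln\left[ \mathrm{Tr}_{n \in \mathcal{I}^r_{pc}(A,n^0)}\, \delta(\omega(n,n^0) - \omega) \right], \tag{6} $$

and the average quantity being

$$ \mathcal{W}(\omega) = \left\langle \mathcal{W}_{A,n^0}(\omega) \right\rangle_{A,n^0}. \tag{7} $$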



This measure is directly linked to the weight enumerator [?], although according 
to our notation, averages are carried out distinguishing between annealed and 
quenched variables unlike the common definition in the IT literature. However, 
as we will show below, the two types of averages provide identical results in this 
particular case. 



3 The statistical physics approach 

Quantities of the type $Q(c) = \langle Q_y(c) \rangle_y$, with $Q_y(c) = \frac{1}{M}\ln[Z_y(c)]$ and
$Z_y(c) = \mathrm{Tr}_x\, \delta(c(x,y) - Mc)$, are very common in the SP of disordered systems;
the macroscopic order parameter $c(x,y)$ is fixed to a specific value and may
depend both on the disorder $y$ and on the microscopic variables $x$. Although we
will not prove this here, such a quantity is generally believed to be
self-averaging in the large system limit, i.e., obeying a probability
distribution $P(Q_y(c)) = \delta(Q_y(c) - Q(c))$. The direct calculation of $Q(c)$ is
known as a quenched average over the disorder, but is typically hard to carry
out and requires using the replica method [?]. The replica method makes use of
the identity $\langle \ln Z \rangle = \lim_{n \to 0} [\langle Z^n \rangle - 1]/n$, by calculating
averages over a product of partition function replicas. Employing assumptions
about replica symmetries and analytically continuing the variable $n$ to zero, one
obtains solutions which enable one to determine the state of the system.

To simplify the calculation, one often employs the so-called annealed
approximation, which consists of performing the average over $Z_y(c)$ first,
followed by the logarithm operation. This avoids the replica method and provides
(through the concavity of the logarithm function) an upper bound to the quenched
quantity:

$$ Q_a(c) = \frac{1}{M} \ln\left[ \langle Z_y(c) \rangle_y \right] \ge Q_q(c) = \frac{1}{M} \left\langle \ln[Z_y(c)] \right\rangle_y = \lim_{n \to 0} \frac{\langle Z_y^n(c) \rangle_y - 1}{nM}. \tag{8} $$
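
The ordering between the annealed and quenched averages, and the small-$n$ replica identity, can be checked numerically on synthetic data. The sketch below assumes a purely hypothetical disorder distribution in which $\ln Z_y$ is Gaussian with mean and spread proportional to $M$; the numbers are not taken from the codes studied here.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 20                                    # hypothetical system size
# Hypothetical disorder: ln Z_y is Gaussian, with extensive mean and spread.
log_Z = M * (0.3 + 0.1 * rng.standard_normal(10_000))

# Annealed: average the partition function first, then take the logarithm.
Q_annealed = np.log(np.mean(np.exp(log_Z))) / M
# Quenched: average the logarithm directly.
Q_quenched = np.mean(log_Z) / M
# Replica identity evaluated at a small but finite replica number n.
n_replica = 1e-4
Q_replica = np.mean(np.expm1(n_replica * log_Z)) / (n_replica * M)

print(f"annealed {Q_annealed:.4f} >= quenched {Q_quenched:.4f}")
```

The annealed value is dominated by rare samples with large $Z_y$, which is the mechanism behind the overestimate of the entropy by the annealed approximation.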

The full technical details of the calculation will be presented elsewhere, and
those of a very similar calculation can be found in e.g. [?]. It turns out that it
is useful to perform the gauge transformation $n_j \to n_j n^0_j$, such that the averages
over the code $A$ and noise $n^0$ can be separated; $\mathcal{W}_{A,n^0}$ becomes independent of
$n^0$, leading to an equality between the quenched and annealed results,
$\mathcal{W}(m) = \mathcal{M}_a(m)|_{p=0} = \mathcal{M}_q(m)|_{p=0}$. For any finite noise value $p$ one should
multiply $\exp[\mathcal{W}(\omega)]$ by the probability that a state obeys all parity checks,
$\exp[-\mathcal{K}(\omega,p)]$, given an overlap $\omega$ and a noise level $p$ [?]. In calculating
$\mathcal{W}(\omega)$ and $\mathcal{M}_{a/q}(m)$, the $\delta$-functions fixing $m$ and $\omega$ are enforced by
introducing Lagrange multipliers $\hat{m}$ and $\hat{\omega}$.

Carrying out the averages explicitly one then employs the saddle point method 
to extremize the averaged quantity with respect to the parameters introduced 
while carrying out the calculation. These lead, in both quenched and annealed 
calculations, to a set of saddle point equations that are solved either analytically 
or numerically to obtain the final expression for the averaged quantity (entropy) . 



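The final expressions for the annealed entropy, under both overlap ($\omega$) and
magnetization ($m$) constraints, are of the form:

$$ Q_a = -\frac{C}{K}\left( \ln 2 + (K-1)\ln(1 + q_1^K) \right) + \ln\left[ \left\langle \mathrm{Tr}_{n=\pm 1}\, e^{n(\hat{\omega} + \hat{m} n^0)} \left(1 + n q_1^{K-1}\right)^C \right\rangle_{n^0} \right] - \hat{\omega}\omega - \hat{m}m, \tag{9} $$

where $q_1$ has to be obtained from the saddle point equation $\partial Q_a / \partial q_1 = 0$. Similarly,
the final expression in the quenched calculation, employing the simplest replica
symmetry assumption [?], is of the form: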



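$$ Q_q = -C \int dx\, d\hat{x}\, \pi(x)\hat{\pi}(\hat{x}) \ln[1 + x\hat{x}] + \frac{C}{K} \int \left\{ \prod_{k=1}^{K} dx_k\, \pi(x_k) \right\} \ln\left[ \frac{1 + \prod_{k=1}^{K} x_k}{2} \right] + \int \left\{ \prod_{c=1}^{C} d\hat{x}_c\, \hat{\pi}(\hat{x}_c) \right\} \left\langle \ln\left[ \mathrm{Tr}_{n=\pm 1} \exp(n(\hat{\omega} + \hat{m} n^0)) \prod_{c=1}^{C} (1 + n\hat{x}_c) \right] \right\rangle_{n^0} - \hat{\omega}\omega - \hat{m}m. \tag{10} $$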



The probability distributions $\pi(x)$ and $\hat{\pi}(\hat{x})$ emerge from the calculation; the
former represents a probability distribution with respect to the noise vector local
magnetization [?], while the latter relates to a field of conjugate variables which
emerge from the introduction of $\delta$-functions while carrying out the averages (for
details see [?]). Their explicit forms are obtained from the functional saddle
point equations $\delta Q_q / \delta\pi(x) = 0$ and $\delta Q_q / \delta\hat{\pi}(\hat{x}) = 0$, and all integrals are
from $-1$ to $1$. Enforcing a $\delta$-function corresponds to taking $\hat{\omega}$, $\hat{m}$ such that
$\partial Q_{a/q} / \partial\hat{\omega} = 0$, $\partial Q_{a/q} / \partial\hat{m} = 0$, while not enforcing it corresponds to
putting $\hat{\omega}$, $\hat{m}$ to 0. Since $\omega$, $m$ follow from $\partial Q_{a/q} / \partial\hat{\omega} = 0$,
$\partial Q_{a/q} / \partial\hat{m} = 0$, all the relevant quantities can be recovered with appropriate
choices of $\hat{\omega}$, $\hat{m}$.
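
A simple consistency check on the annealed expression is possible when neither constraint is enforced. The sketch below assumes that, with both Lagrange multipliers set to zero, the annealed entropy reduces to $-\frac{C}{K}(\ln 2 + (K-1)\ln(1+q_1^K)) + \ln \mathrm{Tr}_{n=\pm 1}(1 + n q_1^{K-1})^C$; this reduced form and the function name are assumptions made for illustration. At $q_1 = 0$ it should give $(1 - C/K)\ln 2$, the entropy of all $2^N$ solutions of the parity checks.

```python
import math

def annealed_entropy_unconstrained(q1, K, C):
    """Assumed annealed entropy with both Lagrange multipliers set to zero."""
    link_term = -(C / K) * (math.log(2) + (K - 1) * math.log(1 + q1 ** K))
    site_term = math.log(sum((1 + n * q1 ** (K - 1)) ** C for n in (+1, -1)))
    return link_term + site_term

K, C = 6, 3                                    # example parameters, rate 1/2
entropy_at_zero = annealed_entropy_unconstrained(0.0, K, C)
entropy_at_one = annealed_entropy_unconstrained(1.0, K, C)
print(entropy_at_zero, (1 - C / K) * math.log(2))
```

The value at $q_1 = 1$ vanishes in this assumed form, which corresponds to a sub-exponential number of states.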



4 Qualitative picture 

We now discuss the qualitative behaviour of $\mathcal{M}(m)$, and the interpretation of
the various decoding schemes. To obtain separate results for $\mathcal{M}(m)$ and $\mathcal{W}(m)$
we calculate the results of Eqs. (9) and (10), corresponding to the annealed and
quenched cases respectively, setting $\hat{\omega} = 0$ for obtaining $\mathcal{M}(m)$ and $\hat{m} = 0$
for obtaining $\mathcal{W}(\omega)$ (that becomes $\mathcal{W}(m)$ after gauging). In Fig. 1, we have
qualitatively plotted the resulting function $\mathcal{M}(m)$ for relevant values of $p$. $\mathcal{M}(m)$
(solid line) only takes positive values in the interval $[m_-(p), m_+(p)]$; for even $K$,
$\mathcal{M}(m)$ is an even function of $m$ and $m_-(p) = -m_+(p)$. The maximum value of
$\mathcal{M}(m)$ is always $(N/M)\ln 2$, the entropy of all $2^N$ solutions of the parity
checks. The true noise $n^0$ has (with probability 1) the typical magnetization of
the BSC: $m(n^0) = m_0(p) = 1 - 2p$ (dashed-dotted line).
The various decoding schemes can be summarized as follows:

- Maximum likelihood (MAP) decoding - minimizes the block error
probability [11] and consists of selecting the $n$ from $\mathcal{I}_{pc}(A,n^0)$ with the
highest magnetization. Since the probability of finding a sub-optimal solution
above $m_+(p)$ vanishes, $P(\exists\, n \in \mathcal{I}^r_{pc} : m(n) > m_+(p)) = 0$, and since
$P(m(n^0) = m_0(p)) = 1$, the critical noise level $p_c$ is determined by the
condition $m_+(p_c) = m_0(p_c)$. The selection process is explained in Fig. 1(a)-(c).

- Typical pairs decoding - is based on randomly selecting an $n$ from $\mathcal{I}_{pc}$
with $m(n) = m_0(p)$ [?]; an error is declared when $n^0$ is not the only element
of $\mathcal{I}_{pc}$. For the same reason as above, the critical noise level $p_c$ is determined
by the condition $m_+(p_c) = m_0(p_c)$.

- Finite temperature (MPM) decoding - An energy $-Fm(n)$ (with
$F = \frac{1}{2}\ln\left(\frac{1-p}{p}\right)$) according to Nishimori's condition^1 is attributed to
each $n \in \mathcal{I}_{pc}$, and a solution is chosen from those with the magnetization that
minimizes the free energy [?]. This procedure is known to minimize the bit error
probability. Using the thermodynamic relation $\mathcal{F} = U - \frac{1}{\beta}S$, $\beta$ being the inverse
temperature (Nishimori's condition corresponds to setting $\beta = 1$), the free
energy of the sub-optimal solutions is given by $\mathcal{F}(m) = -Fm - \frac{1}{\beta}\mathcal{M}(m)$ (for
$\mathcal{M}(m) > 0$), while that of the correct solution is given by $-Fm_0(p)$ (its
entropy being 0). The selection process is explained graphically in Fig. 1(a)-(c).
The free energy differences between sub-optimal solutions relative to that of
the correct solution in the current plots are given by the orthogonal distance
between $\mathcal{M}(m)$ and the line with slope $-\beta F$ through the point $(m_0(p), 0)$.
Solutions with a magnetization $m$ for which $\mathcal{M}(m)$ lies above this line have
a lower free energy, while those for which $\mathcal{M}(m)$ lies below have a higher
free energy. Since negative entropy values are unphysical in discrete systems,
only sub-optimal solutions with $\mathcal{M}(m) > 0$ are considered. The lowest
$p$ value for which there are sub-optimal solutions with a free energy equal
to $-Fm_0(p)$ is the critical noise level $p_c$ for MPM decoding. In fact, using
the convexity of $\mathcal{M}(m)$ and Nishimori's condition, one can show that the
slope $\partial\mathcal{M}(m)/\partial m > -\beta F$ for any value $m < m_0(p)$ and any $p$, and equals
$-\beta F$ only at $m = m_0(p)$; therefore, the critical noise level for MPM decoding
$p = p_c$ is identical to that of MAP, in agreement with results obtained in the
information theory community [12].

The statistical physics interpretation of finite temperature decoding
corresponds to making the specific choice for the Lagrange multiplier $\hat{m} = \beta F$ and
considering the free energy instead of the entropy. In earlier work on MPM
decoding in the SP framework, negative entropy values were treated by
adopting different replica symmetry assumptions, which effectively result
in changing the inverse temperature, i.e., the Lagrange multiplier $\hat{m}$. This
effectively sets $m = m_+(p)$, i.e. to the highest value with non-negative
entropy. The sub-optimal states with the lowest free energy are then those
with $m = m_+(p)$.

The central point in all decoding schemes is to select the correct solution only
on the basis of its magnetization. As long as there are no sub-optimal solutions
with the same magnetization, this is in principle possible. As shown here, all
three decoding schemes discussed above manage to do so. Finding whether at a
given $p$ there exists a gap between the magnetization of the correct solution and
that of the nearest sub-optimal solution just requires plotting $\mathcal{M}(m)$ ($> 0$) and
$m_0(p)$, thus allowing a graphical determination of $p_c$. Since MPM decoding is
done at Nishimori's temperature, the simplest replica symmetry assumption is
sufficient to describe the thermodynamically dominant state [?]. At $p_c$ the states
with $m_+(p_c) = m_0(p_c)$ are thermodynamically dominant, and the $p_c$ values that
we obtain under this assumption are exact.

1 This condition corresponds to the selection of an accurate prior within the Bayesian
framework.
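
The graphical determination of $p_c$ can be mimicked numerically once an entropy curve is available. The sketch below does not use the finite $(K, C)$ enumerator of this paper; as a stand-in it assumes the simple random-code counting expression $H\left(\frac{1+m}{2}\right) - (1-R)\ln 2$, with $H$ the binary entropy in nats, which is an assumption made purely for illustration. For this toy curve $m_+$ does not depend on $p$, and the condition $m_+ = m_0(p_c)$ reproduces Shannon's limit for the BSC.

```python
import math

def typical_magnetization(p):
    """Typical magnetization of BSC noise, m_0(p) = 1 - 2p."""
    return 1.0 - 2.0 * p

def nishimori_field(p):
    """Field F = (1/2) ln((1 - p) / p) used in finite temperature decoding."""
    return 0.5 * math.log((1.0 - p) / p)

def toy_entropy(m, rate):
    """Toy entropy curve (random-code counting; an assumption, see lead-in)."""
    x = (1.0 + m) / 2.0
    h = 0.0 if x in (0.0, 1.0) else -x * math.log(x) - (1 - x) * math.log(1 - x)
    return h - (1.0 - rate) * math.log(2.0)

def toy_entropy_slope(m):
    """Derivative of the toy entropy with respect to m."""
    return 0.5 * math.log((1.0 - m) / (1.0 + m))

def m_plus(rate, tol=1e-12):
    """Largest magnetization with non-negative toy entropy, by bisection."""
    lo, hi = 0.0, 1.0          # entropy is positive at 0 and negative at 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if toy_entropy(mid, rate) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def critical_noise(rate):
    """Solve m_plus = m_0(p_c) = 1 - 2 p_c for the toy curve."""
    return (1.0 - m_plus(rate)) / 2.0

p_c = critical_noise(0.5)
print(f"toy critical noise level for rate 1/2: {p_c:.4f}")
```

For this toy curve the slope at $m = m_0(p)$ equals $-F$ exactly, mirroring the tangency argument used for MPM decoding at Nishimori's condition.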



5 Critical noise level - results 



Some general comments can be made about the critical MAP (or typical set)
values obtained via the annealed and quenched calculations. Since
$\mathcal{M}_q(m) \le \mathcal{M}_a(m)$ (for given values of $K$, $C$ and $p$), we can derive the general
inequality $p_{c,q} \ge p_{c,a}$. For all $K$, $C$ values that we have numerically analyzed,
for both annealed and quenched cases, $m_+(p)$ is a non-increasing function of $p$,
and $p_c$ is unique. The estimates of the critical noise levels $p_{c,a/q}$, based on
$\mathcal{M}_{a/q}$, are obtained by numerically calculating $m_{+,a/q}(p)$, and by determining
their intersection with $m_0(p)$. This is explained graphically in Fig. 2(a). As the
results for MPM decoding have already been presented elsewhere [13], we will now
concentrate on the critical results $p_c$ obtained for typical set and MAP decoding;
these are presented in Fig. 2(b), where the values of $p_{c,a/q}$ for various choices
of $K$ and $C$ are compared with those reported in the literature.

From the table it is clear that the annealed approximation gives a much more
pessimistic estimate for $p_c$. This is due to the fact that it overestimates $\mathcal{M}$ in
the following way. $\mathcal{M}_a(m)$ describes the combined entropy of $n$ and $n^0$ as if $n^0$
were thermal variables as well. Therefore, exponentially rare events for $n^0$ (i.e.
$m(n^0) \neq m_0(p)$) still may carry positive entropy due to the addition of a positive
entropy term from $n$. In a separate study [14] these effects have been taken care
of by the introduction of an extra exponent; this is not necessary in the current
formalism as the quenched calculation automatically suppresses such
contributions. The similarity between the results reported here and those obtained
in [?] is not surprising as the equations obtained in quenched calculations are
similar to those obtained by averaging the upper bound to the reliability exponent
using a method presented originally by Gallager [?]. Numerical differences between
the two sets of results are probably due to the higher numerical precision here.



6 Conclusions 



To summarize, we have shown that the magnetization enumerator $\mathcal{M}(m)$ plays a
central role in determining the achievable critical noise level for various decoding
schemes. The formalism based on the magnetization enumerator $\mathcal{M}$ offers an
intuitively simple alternative to the weight enumerator formalism as used in
typical pairs decoding [?], but requires invoking the replica method given
the very low critical values obtained by the annealed approximation calculation.
Although we have concentrated here on the critical noise level for the BSC, both
other channels and other quantities can also be treated in our formalism. The
predictions for the critical noise level are more optimistic than those reported
in the IT literature, and are up to numerical precision in agreement with those
reported in [?]. Finally, we have shown that the critical noise levels for typical
pairs, MAP and MPM decoding must coincide, and we have provided an intuitive
explanation for the difference between MAP and MPM decoding.
Fig. 1. The qualitative picture of $\mathcal{M}(m) > 0$ (solid lines) for different values of $p$.
For MAP, MPM and typical set decoding, only the relative values of $m_+(p)$ and
$m_0(p)$ determine the critical noise level. Dashed lines correspond to the energy
contribution of $-\beta F$ at Nishimori's condition ($\beta = 1$). The states with the
lowest free energy are indicated with •. a) Sub-critical noise levels $p < p_c$, where
$m_+(p) < m_0(p)$: there are no solutions with higher magnetization than $m_0(p)$, and
the correct solution has the lowest free energy. b) Critical noise level $p = p_c$, where
$m_+(p) = m_0(p)$. The minimum of the free energy of the sub-optimal solutions is
equal to that of the correct solution at Nishimori's condition. c) Over-critical
noise levels $p > p_c$, where many solutions have a higher magnetization than the
true typical one. The minimum of the free energy of the sub-optimal solutions
is lower than that of the correct solution.
Fig. 2. a) Determining the critical noise levels $p_{c,a/q}$ based on the function
$\mathcal{M}_{a/q}$; a qualitative picture. b) Comparison of different critical noise level ($p_c$)
estimates. Typical set decoding estimates have been obtained via the methods
of IT [?], based on having a unique solution to $\mathcal{W}(m) = \mathcal{K}(m,p_c)$, as well as
using the methods of SP [?]. The numerical precision is up to the last digit for
the current method. Shannon's limit denotes the highest theoretically achievable
critical noise level $p_c$ for any code [15].



